Worst-case convergence analysis of inexact gradient and Newton methods through semidefinite programming performance estimation
We provide new tools for worst-case performance analysis of the gradient (or
steepest descent) method of Cauchy for smooth strongly convex functions, and
Newton's method for self-concordant functions, including the case of inexact
search directions. The analysis uses semidefinite programming performance
estimation, as pioneered by Drori and Teboulle [Mathematical Programming,
145(1-2):451-482, 2014], and extends recent performance estimation results for
the method of Cauchy by the authors [Optimization Letters, 11(7), 1185-1199,
2017]. To illustrate the applicability of the tools, we demonstrate a novel
complexity analysis of short step interior point methods using inexact search
directions. As an example in this framework, we sketch how to give a rigorous
worst-case complexity analysis of a recent interior point method by Abernethy
and Hazan [PMLR, 48:2520-2528, 2016].
Comment: 22 pages, 1 figure. Title of earlier version was "Worst-case
convergence analysis of gradient and Newton methods through semidefinite
programming performance estimation".
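The worst-case flavor of this analysis can be illustrated numerically without the SDP machinery. The sketch below (an illustration of a known tight bound, not the paper's performance-estimation code) checks that for gradient descent with step h on mu-strongly convex, L-smooth functions, the per-step contraction of the distance to the minimizer, max(|1 - h*mu|, |1 - h*L|), is attained by the extreme one-dimensional quadratics.

```python
import numpy as np

# Known tight worst-case contraction factor for one gradient step
# x+ = x - h * f'(x) on the class of mu-strongly convex, L-smooth
# functions (attained by 1-D quadratics).
def contraction(h, mu, L):
    return max(abs(1 - h * mu), abs(1 - h * L))

def gd_rate_on_quadratic(h, a, x0=1.0, iters=50):
    # f(x) = a/2 * x^2 has minimizer 0; observed per-step factor |1 - h*a|
    x = x0
    for _ in range(iters):
        x = x - h * a * x
    return abs(x / x0) ** (1.0 / iters)

mu, L = 0.1, 1.0
h = 2.0 / (L + mu)          # step size minimizing the worst-case factor
worst = contraction(h, mu, L)
# the bound is attained on the extreme quadratics (mu/2) x^2 and (L/2) x^2
assert abs(gd_rate_on_quadratic(h, mu) - worst) < 1e-9
assert abs(gd_rate_on_quadratic(h, L) - worst) < 1e-9
```

The semidefinite programming approach generalizes this kind of reasoning: instead of guessing the worst instance, it computes the exact worst-case bound over the whole function class.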
On the oracle complexity of smooth strongly convex minimization
We construct a family of functions suitable for establishing lower bounds on
the oracle complexity of first-order minimization of smooth strongly-convex
functions. Based on this construction, we derive new lower bounds on the
complexity of strongly-convex minimization under various inaccuracy criteria.
The new bounds match the known upper bounds up to a constant factor, and when
the inaccuracy of a solution is measured by its distance to the solution set,
the new lower bound exactly matches the upper bound obtained by the recent
Information-Theoretic Exact Method by the same authors, thereby establishing
the exact oracle complexity for this class of problems.
Efficient First-order Methods for Convex Minimization: a Constructive Approach
We describe a novel constructive technique for devising efficient first-order
methods for a wide range of large-scale convex minimization settings, including
smooth, non-smooth, and strongly convex minimization. The technique builds upon
a certain variant of the conjugate gradient method to construct a family of
methods such that a) all methods in the family share the same worst-case
guarantee as the base conjugate gradient method, and b) the family includes a
fixed-step first-order method. We demonstrate the effectiveness of the approach
by deriving optimal methods for the smooth and non-smooth cases, including new
methods that forego knowledge of the problem parameters at the cost of a
one-dimensional line search per iteration, and a universal method for the union
of these classes that requires a three-dimensional search per iteration. In the
strongly convex case, we show how numerical tools can be used to perform the
construction, and show that the resulting method offers an improved worst-case
bound compared to Nesterov's celebrated fast gradient method.
Comment: Accepted in Mathematical Programming
(https://doi.org/10.1007/s10107-019-01410-2). Code available on GitHub
(https://github.com/AdrienTaylor/GreedyMethods).
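As a reference point for the "base conjugate gradient method" mentioned above, here is a minimal sketch of standard linear conjugate gradients for a quadratic objective (this is the textbook method, not the paper's variant or its derived fixed-step methods):

```python
import numpy as np

# Plain linear conjugate gradient for min_x 1/2 x^T A x - b^T x,
# with A symmetric positive definite; converges in at most n steps
# in exact arithmetic.
def conjugate_gradient(A, b, x0=None, tol=1e-10, max_iter=None):
    n = b.size
    x = np.zeros(n) if x0 is None else x0.copy()
    r = b - A @ x          # residual = negative gradient
    p = r.copy()
    rs = r @ r
    for _ in range(max_iter or n):
        Ap = A @ p
        alpha = rs / (p @ Ap)      # exact line search along p
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        if np.sqrt(rs_new) < tol:
            break
        p = r + (rs_new / rs) * p  # next A-conjugate direction
        rs = rs_new
    return x

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)        # well-conditioned SPD test matrix
b = rng.standard_normal(5)
x = conjugate_gradient(A, b)
assert np.linalg.norm(A @ x - b) < 1e-8
```

The line searches in this method are what the paper's construction trades away: the derived fixed-step methods match its worst-case guarantee without per-iteration searches.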
Principled Analyses and Design of First-Order Methods with Inexact Proximal Operators
Proximal operations are among the most common primitives appearing in both
practical and theoretical (or high-level) optimization methods. This basic
operation typically consists in solving an intermediary (hopefully simpler)
optimization problem. In this work, we survey notions of inaccuracies that can
be used when solving those intermediary optimization problems. Then, we show
that worst-case guarantees for algorithms relying on such inexact proximal
operations can be systematically obtained through a generic procedure based on
semidefinite programming. This methodology is primarily based on the approach
introduced by Drori and Teboulle (Mathematical Programming, 2014) and on convex
interpolation results, and allows producing non-improvable worst-case analyses.
In other words, for a given algorithm, the methodology generates both
worst-case certificates (i.e., proofs) and problem instances on which those
bounds are achieved.
Relying on this methodology, we provide three new methods with conceptually
simple proofs: (i) an optimized relatively inexact proximal point method, (ii)
an extension of the hybrid proximal extragradient method of Monteiro and
Svaiter (SIAM Journal on Optimization, 2013), and (iii) an inexact accelerated
forward-backward splitting supporting backtracking line-search, and both (ii)
and (iii) supporting possibly strongly convex objectives. Finally, we use the
methodology for studying a recent inexact variant of the Douglas-Rachford
splitting due to Eckstein and Yao (Mathematical Programming, 2018).
We showcase and compare the different variants of the accelerated inexact
forward-backward method on a factorization and a total variation problem.
Comment: Minor modifications including acknowledgments and references. Code
available at https://github.com/mathbarre/InexactProximalOperator
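To make the notion of a relatively inexact proximal operation concrete, here is a hedged sketch of the generic pattern (not the paper's optimized method): a proximal point iteration whose subproblem is solved only until a relative accuracy criterion on the subproblem gradient holds. The inner solver, step size, and tolerance parameter sigma are illustrative assumptions.

```python
import numpy as np

# Inexact proximal point iteration: each step approximately solves
#   min_y f(y) + 1/(2*lam) * ||y - x||^2
# by gradient descent, stopping once the subproblem gradient is small
# relative to the step taken (a relative-error criterion).
def inexact_prox_point(grad_f, x0, lam=1.0, sigma=0.5, outer=30, inner=200):
    x = x0.copy()
    for _ in range(outer):
        y = x.copy()
        for _ in range(inner):
            g = grad_f(y) + (y - x) / lam   # gradient of the subproblem
            if np.linalg.norm(y - x) > 0 and \
               np.linalg.norm(g) <= (sigma / lam) * np.linalg.norm(y - x):
                break                        # subproblem solved "well enough"
            y = y - 0.1 * g
        x = y
    return x

# sanity check on f(x) = 1/2 ||x||^2, whose minimizer is the origin
grad_f = lambda x: x
x = inexact_prox_point(grad_f, np.array([5.0, -3.0]))
assert np.linalg.norm(x) < 1e-3
```

The paper's contribution is precisely to certify, via semidefinite programming, tight worst-case rates for methods built around such inexact steps.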
Provable non-accelerations of the heavy-ball method
In this work, we show that the heavy-ball (HB) method provably does not
reach an accelerated convergence rate on smooth strongly convex problems. More
specifically, we show that for any condition number and any choice of
algorithmic parameters, either the worst-case convergence rate of HB on the
class of $L$-smooth and $\mu$-strongly convex quadratic functions is
not accelerated (that is, slower than $1 - \Theta(\sqrt{\mu/L})$), or there
exists an $L$-smooth, $\mu$-strongly convex function and an initialization such
that the method does not converge.
To the best of our knowledge, this result closes a simple yet open question
about one of the most used and iconic first-order optimization techniques.
Our approach builds on finding functions for which HB fails to converge
and instead cycles over finitely many iterates. We analytically describe all
parametrizations of HB that exhibit this cycling behavior on a particular
cycle shape, whose choice is supported by a systematic and constructive
approach to the study of cycling behaviors of first-order methods. We show the
robustness of our results to perturbations of the cycle, and extend them to
classes of functions that also satisfy higher-order regularity conditions.
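The dichotomy can be illustrated numerically on its first horn: with Polyak's classical tuning, heavy-ball is accelerated on quadratics, so by the result above there must exist a non-quadratic strongly convex function on which it fails to converge (the sketch below only checks the quadratic side; the cycling counterexamples are the paper's contribution).

```python
import numpy as np

# Heavy-ball with Polyak's parameters on a strongly convex quadratic:
# the observed rate matches the accelerated rate (sqrt(kappa)-1)/(sqrt(kappa)+1).
mu, L = 0.01, 1.0
kappa = L / mu
alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2               # Polyak step size
beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2

A = np.diag([mu, 0.3, L])          # quadratic f(x) = 1/2 x^T A x
x_prev = x = np.ones(3)
for _ in range(300):
    # heavy-ball update: gradient step plus momentum on the last displacement
    x, x_prev = x - alpha * A @ x + beta * (x - x_prev), x

rate_obs = np.linalg.norm(x) ** (1.0 / 300)
rate_acc = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)      # accelerated rate
assert rate_obs < rate_acc + 0.05   # acceleration holds on quadratics
```

The paper shows this acceleration cannot extend to the full smooth strongly convex class: any tuning that is fast on quadratics cycles, rather than converges, on some function in the class.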
A note on approximate accelerated forward-backward methods with absolute and relative errors, and possibly strongly convex objectives
In this short note, we provide a simple version of an accelerated
forward-backward method (a.k.a. Nesterov's accelerated proximal gradient
method) that may rely on approximate proximal operators and can exploit
strong convexity of the objective function. The method supports both
relative and absolute errors, and its behavior is illustrated on a set of
standard numerical experiments. Using the same developments, we further provide
a version of the accelerated proximal hybrid extragradient method of Monteiro
and Svaiter (2013) possibly exploiting strong convexity of the objective
function.
Comment: Minor modifications in notations and acknowledgments. These methods
were originally presented in arXiv:2006.06041v2. Code available at
https://github.com/mathbarre/StronglyConvexForwardBackwar
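For context, the exact-prox baseline that these approximate variants generalize is the accelerated forward-backward method, i.e. FISTA. Below is a minimal sketch of that baseline (with an exact proximal operator, so none of the note's error-handling) applied to an l1-regularized least-squares problem:

```python
import numpy as np

# Soft-thresholding: the exact proximal operator of t * ||.||_1.
def soft_threshold(v, t):
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

# FISTA for min_x 1/2 ||A x - b||^2 + lam * ||x||_1 (exact prox version).
def fista(A, b, lam, iters=500):
    L = np.linalg.norm(A, 2) ** 2      # Lipschitz constant of the smooth part
    x = z = np.zeros(A.shape[1])
    t = 1.0
    for _ in range(iters):
        x_new = soft_threshold(z - (A.T @ (A @ z - b)) / L, lam / L)
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)   # momentum extrapolation
        x, t = x_new, t_new
    return x

rng = np.random.default_rng(1)
A = rng.standard_normal((20, 10))
b = rng.standard_normal(20)
lam = 0.1
x = fista(A, b, lam)
# optimality: the smooth gradient lies in lam * [-1, 1] componentwise
g = A.T @ (A @ x - b)
assert np.all(np.abs(g) <= lam + 1e-3)
```

The note's methods replace the exact soft-threshold step with an approximate proximal computation satisfying a relative or absolute error criterion, while keeping the accelerated scheme.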
Acceleration Methods
This monograph covers some recent advances in a range of acceleration
techniques frequently used in convex optimization. We first use quadratic
optimization problems to introduce two key families of methods, namely momentum
and nested optimization schemes. They coincide in the quadratic case to form
the Chebyshev method. We discuss momentum methods in detail, starting with the
seminal work of Nesterov and structure convergence proofs using a few master
templates, such as that for optimized gradient methods, which provide the key
benefit of showing how momentum methods optimize convergence guarantees. We
further cover proximal acceleration, at the heart of the Catalyst and
Accelerated Hybrid Proximal Extragradient frameworks, using similar algorithmic
patterns. Common acceleration techniques rely directly on the knowledge of some
of the regularity parameters in the problem at hand. We conclude by discussing
restart schemes, a set of simple techniques for reaching nearly optimal
convergence rates while adapting to unobserved regularity parameters.
Comment: Published in Foundations and Trends in Optimization (see
https://www.nowpublishers.com/article/Details/OPT-036).
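One simple member of the restart-scheme family discussed above is a function-value restart: run Nesterov's method for smooth (not necessarily strongly convex) problems and reset the momentum whenever the objective increases. The sketch below is one such heuristic under illustrative parameter choices, not a specific scheme from the monograph; it adapts to a strong convexity constant the method is never told.

```python
import numpy as np

# Accelerated gradient with function-value restarts: momentum is reset
# (t = 1, extrapolation point back to the best iterate) whenever the
# objective value goes up.
def agd_with_restart(grad, f, L, x0, iters=1000):
    x = z = x0.copy()
    t = 1.0
    fx = f(x)
    for _ in range(iters):
        x_new = z - grad(z) / L            # gradient step at extrapolated point
        if f(x_new) > fx:                  # function-value restart test
            z, t = x, 1.0                  # reset momentum, keep best iterate
            continue
        t_new = (1 + np.sqrt(1 + 4 * t * t)) / 2
        z = x_new + ((t - 1) / t_new) * (x_new - x)
        x, t, fx = x_new, t_new, f(x_new)
    return x

# quadratic whose strong convexity (mu = 0.01) is unknown to the method
A = np.diag(np.linspace(0.01, 1.0, 20))
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x = agd_with_restart(grad, f, L=1.0, x0=np.ones(20))
assert f(x) < 1e-8
```

This is the adaptivity the monograph highlights: the restarted method attains a nearly linear rate governed by the unobserved strong convexity parameter, without ever estimating it.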